( i i ) MOLECULE TYPE: DNA (genomic)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:4:

AGATCATCTC TGCCTGAGTA TCTT



( 2 ) INFORMATION FOR SEQ ID NO:5:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 24 bases
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (genomic) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

CCACCCATGG CAAATTCCAT GGCA 



( 2 ) INFORMATION FOR SEQ ID NO:6: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 24 bases
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (genomic) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:6: 



( 2 ) INFORMATION FOR SEQ ID NO:7: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 amino acids
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: protein

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Asp Asp Ile Asn Pro Thr Val Leu Leu Lys Glu Arg
1               5                   10



( 2 ) INFORMATION FOR SEQ ID NO:8: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 bases 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (genomic)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:8:

CTGCGATGCT CGCCCGCGCC CTG 23



( 2 ) INFORMATION FOR SEQ ID NO:9: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 bases 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (genomic) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:9: 
CTTCTACAGT TCAGTCGAAC GTTC

( 2 ) INFORMATION FOR SEQ ID NO:10:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 604 amino acids
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: protein

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Met Leu Ala Arg Ala Leu Leu Leu Cys Ala Val Leu Ala Leu Ser His
1               5                   10                  15

Thr Ala Asn Pro Cys Cys Ser His Pro Cys Gln Asn Arg Gly Val Cys
            20                  25                  30

Met Ser Val Gly Phe Asp Gln Tyr Lys Cys Asp Cys Thr Arg Thr Gly
        35                  40                  45

Phe Tyr Gly Glu Asn Cys Ser Thr Pro Glu Phe Leu Thr Arg Ile Lys
    50                  55                  60

Leu Phe Leu Lys Pro Thr Pro Asn Thr Val His Tyr Ile Leu Thr His
65                  70                  75                  80

Phe Lys Gly Phe Trp Asn Val Val Asn Asn Ile Pro Phe Leu Arg Asn
                85                  90                  95

Ala Ile Met Ser Tyr Val Leu Thr Ser Arg Ser His Leu Ile Asp Ser
            100                 105                 110

Pro Pro Thr Tyr Asn Ala Asp Tyr Gly Tyr Lys Ser Trp Glu Ala Phe
        115                 120                 125

Ser Asn Leu Ser Tyr Tyr Thr Arg Ala Leu Pro Pro Val Pro Asp Asp
    130                 135                 140

Cys Pro Thr Pro Leu Gly Val Lys Gly Lys Lys Gln Leu Pro Asp Ser
145                 150                 155                 160

Asn Glu Ile Val Glu Lys Leu Leu Leu Arg Arg Lys Phe Ile Pro Asp
                165                 170                 175

Pro Gln Gly Ser Asn Met Met Phe Ala Phe Phe Ala Gln His Phe Thr
            180                 185                 190

His Gln Phe Phe Lys Thr Asp His Lys Arg Gly Pro Ala Phe Thr Asn
        195                 200                 205

Gly Leu Gly His Gly Val Asp Leu Asn His Ile Tyr Gly Glu Thr Leu
    210                 215                 220

Ala Arg Gln Arg Lys Leu Arg Leu Phe Lys Asp Gly Lys Met Lys Tyr
225                 230                 235                 240

Gln Ile Ile Asp Gly Glu Met Tyr Pro Pro Thr Val Lys Asp Thr Gln
                245                 250                 255

Ala Glu Met Ile Tyr Pro Pro Gln Val Pro Glu His Leu Arg Phe Ala
            260                 265                 270

Val Gly Gln Glu Val Phe Gly Leu Val Pro Gly Leu Met Met Tyr Ala
        275                 280                 285

Thr Ile Trp Leu Arg Glu His Asn Arg Val Cys Asp Val Leu Lys Gln
    290                 295                 300

Glu His Pro Glu Trp Gly Asp Glu Gln Leu Phe Gln Thr Ser Arg Leu
305                 310                 315                 320

Ile Leu Ile Gly Glu Thr Ile Lys Ile Val Ile Glu Asp Tyr Val Gln
                325                 330                 335

His Leu Ser Gly Tyr His Phe Lys Leu Lys Phe Asp Pro Glu Leu Leu
            340                 345                 350

Phe Asn Lys Gln Phe Gln Tyr Gln Asn Arg Ile Ala Ala Glu Phe Asn
        355                 360                 365

Thr Leu Tyr His Trp His Pro Leu Leu Pro Asp Thr Phe Gln Ile His
    370                 375                 380

Asp Gln Lys Tyr Asn Tyr Gln Gln Phe Ile Tyr Asn Asn Ser Ile Leu
385                 390                 395                 400

Leu Glu His Gly Ile Thr Gln Phe Val Glu Ser Phe Thr Arg Gln Ile
                405                 410                 415

Ala Gly Arg Val Ala Gly Gly Arg Asn Val Pro Pro Ala Val Gln Lys
            420                 425                 430

Val Ser Gln Ala Ser Ile Asp Gln Ser Arg Gln Met Lys Tyr Gln Ser
        435                 440                 445

Phe Asn Glu Tyr Arg Lys Arg Phe Met Leu Lys Pro Tyr Glu Ser Phe
    450                 455                 460

Glu Glu Leu Thr Gly Glu Lys Glu Met Ser Ala Glu Leu Glu Ala Leu
465                 470                 475                 480

Tyr Gly Asp Ile Asp Ala Val Glu Leu Tyr Pro Ala Leu Leu Val Glu
                485                 490                 495

Lys Pro Arg Pro Asp Ala Ile Phe Gly Glu Thr Met Val Glu Val Gly
            500                 505                 510

Ala Pro Phe Ser Leu Lys Gly Leu Met Gly Asn Val Ile Cys Ser Pro
        515                 520                 525

Ala Tyr Trp Lys Pro Ser Thr Phe Gly Gly Glu Val Gly Phe Gln Ile
    530                 535                 540

Ile Asn Thr Ala Ser Ile Gln Ser Leu Ile Cys Asn Asn Val Lys Gly
545                 550                 555                 560

Cys Pro Phe Thr Ser Phe Ser Val Pro Asp Pro Glu Leu Ile Lys Thr
                565                 570                 575

Val Thr Ile Asn Ala Ser Ser Ser Arg Ser Gly Leu Asp Asp Ile Asn
            580                 585                 590

Pro Thr Val Leu Leu Lys Glu Arg Ser Thr Glu Leu
        595                 600

( 2 ) INFORMATION FOR SEQ ID NO:11:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 3387 bases
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (genomic)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GTCCAGGAAC TCCTCAGCAG CGCCTCCTTC AGCTCCACAG CCAGACGCCC TCAGACAGCA 60 

AAGCCTACCC CCGCGCCGCG CCCTGCCCGC CGCTGCGATG CTCGCCCGCG CCCTGCTGCT 120 

GTGCGCGGTC CTGGCGCTCA GCCATACAGC AAATCCTTGC TGTTCCCACC CATGTCAAAA 180

CCGAGGTGTA TGTATGAGTG TGGGATTTGA CCAGTATAAG TGCGATTGTA CCCGGACAGG 240

ATTCTATGGA GAAAACTGCT CAACACCGGA ATTTTTGACA AGAATAAAAT TATTTCTGAA 300

ACCCACTCCA AACACAGTGC ACTACATACT TACCCACTTC AAGGGATTTT GGAACGTTGT 360

GAATAACATT CCCTTCCTTC GAAATGCAAT TATGAGTTAT GTGTTGACAT CCAGATCACA 420

TTTGATTGAC AGTCCACCAA CTTACAATGC TGACTATGGC TACAAAAGCT GGGAAGCCTT 480 

CTCTAACCTC TCCTATTATA CTAGAGCCCT TCCTCCTGTG CCTGATGATT GCCCGACTCC 540

CTTGGGTGTC AAAGGTAAAA AGCAGCTTCC TGATTCAAAT GAGATTGTGG AAAAATTGCT 600

TCTAAGAAGA AAGTTCATCC CTGATCCCCA GGGCTCAAAC ATGATGTTTG CATTCTTTGC 660

CCAGCACTTC ACGCACCAGT TTTTCAAGAC AGATCATAAG CGAGGGCCAG CTTTCACCAA 720

CGGGCTGGGC CATGGGGTGG ACTTAAATCA TATTTACGGT GAAACTCTGG CTAGACAGCG 780

TAAACTGCGC CTTTTCAAGG ATGGAAAAAT GAAATATCAG ATAATTGATG GAGAGATGTA 840

TCCTCCCACA GTCAAAGATA CTCAGGCAGA GATGATCTAC CCTCCTCAAG TCCCTGAGCA 900

TCTACGGTTT GCTGTGGGGC AGGAGGTCTT TGGTCTGGTG CCTGGTCTGA TGATGTATGC 960

CACAATCTGG CTGCGGGAAC ACAACAGAGT ATGTGATGTG CTTAAACAGG AGCATCCTGA 1020

ATGGGGTGAT GAGCAGTTGT TCCAGACAAG CAGGCTAATA CTGATAGGAG AGACTATTAA 1080

GATTGTGATT GAAGATTATG TGCAACACTT GAGTGGCTAT CACTTCAAAC TGAAATTTGA 1140

CCCAGAACTA CTTTTCAACA AACAATTCCA GTACCAAAAT CGTATTGCTG CTGAATTTAA 1200

CACCCTCTAT CACTGGCATC CCCTTCTGCC TGACACCTTT CAAATTCATG ACCAGAAATA 1260

CAACTATCAA CAGTTTATCT ACAACAACTC TATATTGCTG GAACATGGAA TTACCCAGTT 1320

TGTTGAATCA TTCACCAGGC AAATTGCTGG CAGGGTTGCT GGTGGTAGGA ATGTTCCACC 1380

CGCAGTACAG AAAGTATCAC AGGCTTCCAT TGACCAGAGC AGGCAGATGA AATACCAGTC 1440

TTTTAATGAG TACCGCAAAC GCTTTATGCT GAAGCCCTAT GAATCATTTG AAGAACTTAC 1500

AGGAGAAAAG GAAATGTCTG CAGAGTTGGA AGCACTCTAT GGTGACATCG ATGCTGTGGA 1560

GCTGTATCCT GCCCTTCTGG TAGAAAAGCC TCGGCCAGAT GCCATCTTTG GTGAAACCAT 1620

GGTAGAAGTT GGAGCACCAT TCTCCTTGAA AGGACTTATG GGTAATGTTA TATGTTCTCC 1680

TGCCTACTGG AAGCCAAGCA CTTTTGGTGG AGAAGTGGGT TTTCAAATCA TCAACACTGC 1740

CTCAATTCAG TCTCTCATCT GCAATAACGT GAAGGGCTGT CCCTTTACTT CATTCAGTGT 1800

TCCAGATCCA GAGCTCATTA AAACAGTGAG GATCAATGCA AGTTCTTCCC GCTCCGGACT 1860

AGATGATATC AATCCCACAG TACTACTAAA AGAACGGTCG ACTGAACTGT AGAAGTCTAA 1920

TGATCATATT TATTTATTTA TATGAACCAT GTCTATTAAT TTAATTATTT AATAATATTT 1980

ATATTAAACT CCTTATGTTA CTTAACATCT TCTGTAACAG AAGTCAGTAC TCCTGTTGCG 2040

GAGAAAGGAG TCATACTTGT GAAGACTTTT ATGTCACTAC TCTAAAGATT TTGCTGTTGC 2100

TGTTAAGTTT GGAAAACAGT TTTTATTCTG TTTTATAAAC CAGAGAGAAA TGAGTTTTGA 2160

CGTCTTTTTA CTTGAATTTC AACTTATATT ATAAGGACGA AAGTAAAGAT GTTTGAATAC 2220

TTAAACACTA TCACAAGATG CCAAAATGCT GAAAGTTTTT ACACTGTCGA TGTTTCCAAT 2280

GCATCTTCCA TGATGCATTA GAAGTAACTA ATGTTTGAAA TTTTAAAGTA CTTTTGGGTA 2340

TTTTTCTGTC ATCAAACAAA ACAGGTATCA GTGCATTATT AAATGAATAT TTAAATTAGA 2400

CATTACCAGT AATTTCATGT CTACTTTTTA AAATCAGCAA TGAAACAATA ATTTGAAATT 2460

TCTAAATTCA TAGGGTAGAA TCACCTGTAA AAGCTTGTTT GATTTCTTAA AGTTATTAAA 2520

CTTGTACATA TACCAAAAAG AAGCTGTCTT GGATTTAAAT CTGTAAAATC AGATGAAATT 2580

TTACTACAAT TGCTTGTTAA AATATTTTAT AAGTGATGTT CCTTTTTCAC CAAGAGTATA 2640

AACCTTTTTA GTGTGACTGT TAAAACTTCC TTTTAAATCA AAATGCCAAA TTTATTAAGG 2700

TGGTGGAGCC ACTGCAGTGT TATCTCAAAA TAAGAATATC CTGTTGAGAT ATTCCAGAAT 2760

CTGTTTATAT GGCTGGTAAC ATGTAAAAAC CCCATAACCC CGCCAAAAGG GGTCCTACCC 2820

TTGAACATAA AGCAATAACC AAAGGAGAAA AGCCCAAATT ATTGGTTCCA AATTTAGGGT 2880

TTAAACTTTT TGAAGCAAAC TTTTTTTTAG CCTTGTGCAC TGCAGACCTG GTACTCAGAT 2940

TTTGCTATGA GGTTAATGAA GTACCAAGCT GTGCTTGAAT AACGATATGT [illegible] 3000

TTTCTGTTGT ACAGTTTAAT TTAGCAGTCC ATATCACATT GCAAAAGTAG CAATGACCTC 3060

ATAAAATACC TCTTCAAAAT GCTTAAATTC ATTTCACACA TTAATTTTAT CTCAGTCTTG 3120

AAGCCAATTC AGTAGGTGCA TTGGAATCAA GCCTGGCTAC CTGCATGCTG TTCCTTTTCT 3180

TTTCTTCTTT TAGCCATTTT GCTAAGAGAC ACAGTCTTCT CAAACACTTC GTTTCTCCTA 3240

TTTTGTTTTA CTAGTTTTAA GATCAGAGTT CACTTTCTTT GGACTCTGCC TATATTTTCT 3300

TACCTGAACT TTTGCAAGTT TTCAGGTAAA CCTCAGCTCA GGACTGCTAT TTAGCTCCTC 3360

TTAAGAAGAT TAAAAAAAAA AAAAAAG 3387



( 2 ) INFORMATION FOR SEQ ID NO:12: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 21 bases 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (genomic)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:12:

CCTTCCTTCC AAATGCAATT A 



( 2 ) INFORMATION FOR SEQ ID NO:13:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 21 bases
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (genomic) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:13: 
AAACTGATGC GTGAAGTGCT G



( 2 ) INFORMATION FOR SEQ ID NO:14:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 21 bases
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (genomic)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:14:

GAGATTGTGG GAAAATTGCT T



